Today we will be focusing on the practice of fancy geospatial data analysis and visualization.
Overview
Two quick things.
st_intersection (vs st_filter) for clipping to a boundary
summarizing statistics using clipped data
Libraries
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr 1.1.2 ✔ readr 2.1.4
✔ forcats 1.0.0 ✔ stringr 1.5.0
✔ ggplot2 3.4.3 ✔ tibble 3.2.1
✔ lubridate 1.9.2 ✔ tidyr 1.3.0
✔ purrr 1.0.2
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag() masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
Linking to GEOS 3.11.2, GDAL 3.6.2, PROJ 9.2.0; sf_use_s2() is TRUE
Data
I am going to use an in-class project example from the Farmlands group.
We’ll pull in California Counties for our boundary and farmlands for the dataset of interest.
County boundaries are available at the California open data portal .
Check the name you download it as
if a shapefile you point to the folder directory CA_Counties
if a geojson you point to the file itself
We’ll filter it just for Imperial County in this example of farmlands.
Reading layer `CA_Counties_TIGER2016' from data source
`C:\Dev\EA078_Fall2023\CA_Counties' using driver `ESRI Shapefile'
Simple feature collection with 58 features and 17 fields
Geometry type: MULTIPOLYGON
Dimension: XY
Bounding box: xmin: -13857270 ymin: 3832931 xmax: -12705030 ymax: 5162404
Projected CRS: WGS 84 / Pseudo-Mercator
Let’s look at Figure 20.1 for Imperial County boundary to see if it worked.
Now let’s load the Agland dataset.
Reading layer `AgLand' from data source `C:\Dev\EA078_Fall2023\AgLand.geojson' using driver `GeoJSON'
Simple feature collection with 14729 features and 6 fields
Geometry type: MULTIPOLYGON
Dimension: XY
Bounding box: xmin: -119.478 ymin: 32.65257 xmax: -114.4952 ymax: 35.05472
Geodetic CRS: WGS 84
Summary stats pre-clipping
Let’s make a barchart of the area of farmland in the SCAG planning area.
The Agland file has a variable called ‘Shape_Area’. Let’s summarize the total land in each category on a simple bar chart.
Figure 20.2
Cool!
Clip Farmland data to Imperial County
Previously, I have showed st_filter as the function to use for spatial joins. The default type of spatial join in st_filter is classified as st_intersect.
Let’s show an example of st_intersect being applied to the Agland dataset to see how we might get a different result if we focus on a subset of the data.
When doing spatial joins, it can be important to turn off spherical geometry which is the first bit of code sf_use_s2(FALSE).
Spherical geometry (s2) switched off
although coordinates are longitude/latitude, st_intersection assumes that they
are planar
Warning: attribute variables are assumed to be spatially constant throughout
all geometries
Check the map in Figure 20.3 to see it worked.
palFarms <- colorFactor ( palette = 'Set2' , domain = Imp_ag $ TYPE )
leaflet ( ) |>
addProviderTiles ( provider = providers $ Esri.WorldImagery ) |>
addPolygons ( data = Imp_ag ,
color = ~ palFarms ( TYPE ) ,
weight = 1 ,
fillOpacity = 0.8 ) |>
addLegend ( data = Imp_ag ,
title = 'Farmland type' ,
pal = palFarms ,
values = ~ TYPE )
Cool!
Summary Stats in Imperial County
Now let’s do a bar chart for just the Imperial County data.
Follow the same steps as in Figure 20.2 but apply them to Imp_ag.
Figure 20.4 shows the result.